ABSTRACT


1. Introduction 


An automated time and attendance marking system can help schools and higher education institutions in many
ways. It saves considerable time and money by eliminating much of the manual work involved in attendance and
leave entry and in calculating hours attended. Because attendance is recorded automatically, the data is less
prone to the errors of manual entry. Once attendance is marked, the captured data is stored in the student
attendance system, which is especially useful when locating a particular student or analysing trends. A face
recognition system is a computer application capable of identifying or verifying a person from a digital image.
One way to do this is to compare selected facial features from the image with a face database. An automated
attendance system is an advance in the field of automation that replaces the traditional attendance marking
activity.


Attendance management systems are generally built on biometrics. In our case, we collect the facial database
using a face detection algorithm and classify the faces using the eigenfaces approach.
2. Literature Survey 


Henry A. Rowley, Shumeet Baluja, and Takeo Kanade developed a neural network-based approach to detect
frontal views of faces in gray-scale photographs. The algorithm and training methods are general and may be
applied to various types of faces as well as related objects and patterns. Their bootstrap training procedure
adds false detections to the training set as training progresses, which removes the time-consuming task of
manually selecting non-face training examples that must be picked to cover the full non-face picture space.
They reported that the system can detect 90.5 percent of faces.


Henry Schneiderman and Takeo Kanade described a statistical method for 3D object detection. They used a
product of histograms to express the statistics of object and non-object appearance. Each histogram represents
the joint statistics of a subset of wavelet coefficients and their position on the object. They used many such
histograms to represent a wide variety of visual attributes. Using this model, they built the first algorithm
that can reliably detect human faces with out-of-plane rotation. On a test set of 208 photos containing 441
faces, 347 of which are in profile view, the greatest accuracy observed was around 92.7 percent.


Paul Viola and Michael Jones developed a frontal face detection system that differed from earlier systems in
its ability to detect faces quickly. The method employs a new image representation known as the integral image,
which allows features to be evaluated rapidly. The classifier is built by using AdaBoost to select a small
number of significant features, and increasingly complex classifiers are then combined in a cascade structure
that greatly increases the detector's speed. The reported detection rate was 90%.


Ajay Kumar Bansal and Pankaj Chawla evaluated the performance of face recognition using the Principal
Component Analysis (PCA) technique. The tests were done on the Olivetti Research Laboratory (ORL) database,
the Indian face database, and the Georgia Tech face database, which cover different expressions, poses, and
facial characteristics. In the experimental setup, the share of training photos for all databases was varied
from 80% to 40%: initially 80% of the images were used for training and 20% for testing, and the ratio was then
changed to 60/40 and to 40/60. The comparison reports PCA accuracies of 92.50 percent, 74.17 percent, and
61.25 percent.


Mrunmayee Shirodkar, Varun Sinha, Urvi Jain, and Bhushan Nemade presented a system for students that
includes face detection, feature extraction, feature detection, attendance analysis, and monthly attendance
report production. For face detection, the suggested system uses image contrasts, integral images, AdaBoost,
Haar-like features, and a cascade classifier. Student faces are recognised using an improved Local Binary
Pattern (LBP) against a database of student photos. This combination of algorithms achieved an accuracy of
88.08 percent with less time consumption.


Suman Kumar Bhattacharyya and Kumar Rahul used Linear Discriminant Analysis (LDA) for facial recognition,
testing on the ORL face database. By using a linear discriminant criterion, the LDA approach overcomes the
limitations of the Principal Component Analysis (PCA) method. This criterion aims to maximise the ratio of the
determinant of the projected samples' between-class scatter matrix to the determinant of the projected samples'
within-class scatter matrix. The final experimental findings show a 92.5 percent recognition rate.
3. Face Recognition System 

This work has three main steps: 

(1) Building database using Viola Jones Algorithm. 

(2) Feature Extraction using Eigen faces and Classification using Euclidean distance. 


(3) Updating Attendance. 
3.1. Building database using Viola-Jones Algorithm


The first step of our project is to build the database, i.e. to obtain only the face from each image. In simple
terms, we need a face locator. A face locator must tell whether a photo of arbitrary size contains a human face
and, if so, where the face is located in the image. For this we use the Viola-Jones algorithm, named after two
computer vision researchers, Paul Viola and Michael Jones, who proposed the method in 2001 in their paper
“Rapid Object Detection using a Boosted Cascade of Simple Features”.


This algorithm works better for frontal faces than for faces looking upwards, sideways, or downwards. The
image is first converted to grayscale, because grayscale images are easier to work with than colour images and
contain less data to process. The algorithm then looks at many smaller sub-regions and tries to find a face by
looking for specific features in each sub-region. It needs to check many different positions and scales because
an image can contain many faces of various sizes.
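The scan over positions and scales can be sketched as a simple generator. This is only an illustration of the idea, not the detector used in this work; the function name `sliding_windows` and the parameters `min_size`, `scale`, and `step_frac` are assumptions chosen for the example.

```python
def sliding_windows(img_w, img_h, min_size=24, scale=1.25, step_frac=0.1):
    """Yield (x, y, size) for square sub-windows at every position and scale.

    min_size, scale and step_frac are illustrative defaults (assumptions),
    not values taken from this paper.
    """
    size = min_size
    while size <= min(img_w, img_h):
        # The step grows with the window so larger windows move in larger strides.
        step = max(1, int(size * step_frac))
        for y in range(0, img_h - size + 1, step):
            for x in range(0, img_w - size + 1, step):
                yield x, y, size
        # Enlarge the window, always by at least one pixel.
        size = max(size + 1, int(size * scale))


# Enumerate the candidate windows for a 92 x 112 image.
windows = list(sliding_windows(92, 112))
```

Each yielded window would then be passed to the feature-based classifier; smaller values of `step_frac` produce more overlapping windows at the cost of more evaluations.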


Fig.1. Detection of Haar features 


As shown in Fig. 1, Viola-Jones outlines a box and searches for a face within it, essentially by looking for
Haar-like features. After evaluating a tile, the box moves one step to the right, and so on through every tile
in the picture. If face features are detected the box turns green; otherwise it is red. The box size and step
size can be changed according to our needs. With smaller steps, many boxes detect face-like (Haar-like)
features, and the aggregated data from all those boxes helps the algorithm determine where the face is located
in the image.
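To make the notion of a Haar-like feature concrete, the sketch below evaluates one two-rectangle feature with an integral image, the representation Viola and Jones introduced for fast feature evaluation. It is a minimal pure-Python illustration; the function names and the 4 x 4 toy image are assumptions, and a real detector evaluates many such features inside a boosted cascade.

```python
def integral_image(img):
    """Return a (h+1) x (w+1) summed-area table with a zero top row and left column."""
    h, w = len(img), len(img[0])
    ii = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        row_sum = 0
        for x in range(w):
            row_sum += img[y][x]
            ii[y + 1][x + 1] = ii[y][x + 1] + row_sum
    return ii


def rect_sum(ii, x, y, w, h):
    """Sum of the pixels in the rectangle with top-left corner (x, y), using four lookups."""
    return ii[y + h][x + w] - ii[y][x + w] - ii[y + h][x] + ii[y][x]


def haar_two_rect_horizontal(ii, x, y, w, h):
    """Left half minus right half: a large value indicates a vertical edge in the box."""
    half = w // 2
    return rect_sum(ii, x, y, half, h) - rect_sum(ii, x + half, y, half, h)


# Toy grayscale image: bright left half, dark right half.
image = [
    [9, 9, 1, 1],
    [9, 9, 1, 1],
    [9, 9, 1, 1],
    [9, 9, 1, 1],
]
ii = integral_image(image)
feature = haar_two_rect_horizontal(ii, 0, 0, 4, 4)  # 72 - 8 = 64
```

Because every rectangle sum costs four table lookups regardless of its size, the same feature can be evaluated at any box size in constant time.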




Fig.2. Block diagram for building database using Viola-Jones 
Fig.3. Database for each user
Each folder contains c images of one user.


3.2. Feature Extraction using eigenfaces and Classification using Euclidean distance 
Fig.4. Block diagram for Recognition and Updation


Here we create a folder for each user and store the facial images of that user in it. We created 10 folders
named s1, s2, and so on up to s10. Each folder holds 150 images of that user in different poses. For example,
s1 consists of 150 facial images of person 1, and similarly s2 contains 150 images of person 2. These 150
images are later used for training and testing. The facial images are collected using the Viola-Jones
algorithm. Every image must be the same size; we use 92 pixels in width and 112 pixels in height. The output of
the detection stage is shown in Fig. 6.
Fig.5. Flowchart


[Panels: Original Image, Output Detected Image]
Fig.6. Output of Face detection
3.2.1. Extracting Eigen values 


Take a face image $I(x, y)$ as a two-dimensional $N \times N$ array and convert it into a vector of dimension
$N^2 \times 1$. Starting from a training set of $M$ images of size $N \times N$, each image is converted into
an $N^2 \times 1$ vector, so the training set forms a matrix of dimension $N^2 \times M$, where $M$ is the
number of samples. Let $\Gamma_1, \Gamma_2, \Gamma_3, \ldots, \Gamma_M$ be the face images in the training set.
Next, compute the mean of the images in the training set. The average face of the set is defined by

$$\Psi = \frac{1}{M} \sum_{i=1}^{M} \Gamma_i \qquad (1)$$


The Normalized Image Coefficient $\phi$ is introduced to improve the chances of verification even when a
random image matches the eigenface image only weakly. It is defined as the square root of the sum of the
squares of the differences between the images and the average face image, divided by $i$:

$$\phi_i = \sqrt{\frac{\sum (\Gamma_i - \Psi)^2}{i}}, \quad i = 1, 2, 3, \ldots, M \qquad (2)$$


Using the results of the above step, the matrix $A$ is computed, i.e.

$$A = [\phi_1 \; \phi_2 \; \ldots \; \phi_M]$$

The covariance matrix $C$ is formed by multiplying $A$ by its transpose:

$$C = A A^T \qquad (3)$$


Therefore the “dimension of the matrix $C$ is $N^2 \times N^2$, and determining the $N^2$ eigenvectors and
eigenvalues is an intractable task for typical image sizes”. If the number of data points in the image space is
less than the dimension of the space ($M < N^2$), there will be only $M - 1$, rather than $N^2$, meaningful
eigenvectors. We therefore consider the matrix $Q = A^T A$ of dimension $M \times M$ and compute its
eigenvalues and eigenvectors. Let $\mu_i$ and $v_i$ be the eigenvalues and eigenvectors of $Q$, such that

$$A^T A v_i = \mu_i v_i$$

Multiplying both sides by $A$ gives

$$A (A^T A v_i) = \mu_i A v_i$$

$$C \, A v_i = \mu_i \, A v_i$$

Therefore, the eigenvectors of the covariance matrix $C$ are $A v_i$.

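The eigenfaces $u_i$ are computed by using the eigenvectors $v_i$ to form linear combinations of the $M$
training set face images:

$$u_i = \sum_{k=1}^{M} v_{ik} \phi_k, \quad i = 1, 2, 3, \ldots, M$$

In particular,

$$u_1 = v_{11}\phi_1 + v_{12}\phi_2 + v_{13}\phi_3 + \ldots + v_{1M}\phi_M$$

$$u_2 = v_{21}\phi_1 + v_{22}\phi_2 + v_{23}\phi_3 + \ldots + v_{2M}\phi_M$$

$$u_M = v_{M1}\phi_1 + v_{M2}\phi_2 + v_{M3}\phi_3 + \ldots + v_{MM}\phi_M$$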


After the eigenfaces are obtained, the images in the database are projected into the eigenface space, and the
weights of the images in that space are stored.
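The computation of eigenfaces through the smaller $M \times M$ matrix can be sketched with NumPy as follows. This is a minimal illustration under stated assumptions: it uses plain mean subtraction for the columns of $A$ (the standard eigenface formulation) rather than the Normalized Image Coefficient of Eq. (2), the input is synthetic random data standing in for face images, and the function name `compute_eigenfaces` is hypothetical.

```python
import numpy as np


def compute_eigenfaces(images, num_components):
    """images: array of shape (M, H, W).

    Returns (mean_face, eigenfaces, weights, eigenvalues).
    Assumption: columns of A are mean-subtracted images, not the
    normalized coefficient described in the text.
    """
    M = images.shape[0]
    gamma = images.reshape(M, -1).astype(np.float64).T  # N^2 x M, one column per face
    psi = gamma.mean(axis=1, keepdims=True)             # average face, as in Eq. (1)
    A = gamma - psi                                     # difference images
    Q = A.T @ A                                         # M x M surrogate for C = A A^T
    mu, V = np.linalg.eigh(Q)                           # eigenvalues in ascending order
    order = np.argsort(mu)[::-1][:num_components]       # indices of the largest eigenvalues
    U = A @ V[:, order]                                 # eigenvectors of C are A v_i
    U /= np.linalg.norm(U, axis=0, keepdims=True)       # scale each eigenface to unit length
    weights = U.T @ A                                   # projections of the training images
    return psi, U, weights, mu[order]


rng = np.random.default_rng(0)
faces = rng.random((6, 8, 8))  # six synthetic 8 x 8 images, stand-ins for real faces
psi, U, W, mu = compute_eigenfaces(faces, num_components=3)
```

Working with $Q$ keeps the eigen-decomposition at size $M \times M$ (here 6 x 6) instead of $N^2 \times N^2$ (here 64 x 64), which is the point of the derivation above.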
Fig.7. Original images and corresponding Eigen Faces


The weights of an image in the eigenface space are calculated using the formula

$$w_i = u_i^T \phi \qquad (4)$$


To recognise a face, first calculate the Euclidean distance between the eigenface representation of the image
and the eigenface representations stored previously.

For an unknown image $\Gamma$, calculate $\phi$ and $\phi_f = \sum_i w_i u_i$, where $w_i = u_i^T \phi$.
Finally, calculate the Euclidean distance between $\phi$ and $\phi_f$.

If the Euclidean distance is the minimum and is below the threshold value, the unknown image is identified.
Otherwise the unknown image is not recognised.
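The decision rule (minimum Euclidean distance, accepted only below a threshold) can be sketched as a nearest-neighbour search. The sketch assumes each known user is represented by one stored weight vector and compares weight vectors directly; the function name `recognise`, the two-dimensional vectors, and the threshold value are illustrative assumptions, not values from this work.

```python
import math


def recognise(query_weights, known_weights, threshold):
    """Return the label of the nearest stored weight vector, or None if it is too far."""
    best_label, best_dist = None, math.inf
    for label, weights in known_weights.items():
        d = math.dist(query_weights, weights)  # Euclidean distance
        if d < best_dist:
            best_label, best_dist = label, d
    # Accept the closest match only if it falls below the threshold.
    return best_label if best_dist < threshold else None


# Hypothetical stored weights for two users.
known = {"s1": [0.0, 0.0], "s2": [3.0, 4.0]}
match = recognise([0.1, 0.0], known, threshold=1.0)       # close to s1
no_match = recognise([10.0, 10.0], known, threshold=1.0)  # far from every user
```

The threshold trades false acceptances against false rejections, so in practice it would be tuned on held-out images.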


In all of the above cases, $i$ takes the values 1, 2, 3, ..., n. We examined in depth the key features that
play a primary role in determining the results of image identification over the database. Several distinct
image processing formulae were framed and tested with a small sample of images. We concluded that the last one
is the best for identifying a random image, even when it has very few features in common with the mean/original
image, so we concentrated on it and evaluated the image metrics using this parameter. This is expected to give
better Euclidean distance and mean square error values when matching the random image with the eigenface image.
A random image is an image of a particular person at any moment. Using this formula, it is easy to reduce error
values such as the Euclidean distance and the mean square error.
Fig.8. Block Diagram for Calculation of Eigen Faces
3.3. Updating Attendance


After the face is recognised, the last step is to update the attendance in the Excel sheet, in the column
corresponding to the class number.
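The update step can be sketched as follows. As an assumption for the example, the sheet is held as a list of rows and written out as CSV with Python's standard `csv` module, standing in for the Excel sheet; the column name `class_3`, the marks `P` and `A`, and both function names are hypothetical.

```python
import csv
import io


def mark_present(rows, student_id, column):
    """Mark one student present in the given class column; others default to absent."""
    for row in rows:
        row.setdefault(column, "A")  # a new column starts with everyone absent
        if row["student"] == student_id:
            row[column] = "P"
    return rows


def to_csv(rows):
    """Serialise the sheet as CSV text, using the keys of the first row as the header."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=list(rows[0].keys()))
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()


sheet = [{"student": "s1"}, {"student": "s2"}]
mark_present(sheet, "s2", "class_3")  # s2 was recognised during class 3
csv_text = to_csv(sheet)
```

Counting the `P` entries in a student's row then gives the number of classes attended.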
4. Performance Metrics 


A confusion matrix can be defined as “visualizing the performance of your prediction model in tabular form”.
Each entry in a confusion matrix gives the number of predictions the model made for a given combination of
actual and predicted class, so it shows where the classification was correct and where it was incorrect.
Relying on accuracy alone can be misleading if the classes have unequal numbers of observations or if the
dataset has more than two classes.


True Positive (TP):

The number of times the classifier correctly predicts the positive class as positive.

True Negative (TN):

The number of times the classifier correctly predicts the negative class as negative.

False Positive (FP):

The number of times the classifier incorrectly predicts the negative class as positive.

False Negative (FN):

The number of times the classifier incorrectly predicts the positive class as negative.
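With more than two users, these four counts are usually obtained per user by treating that user as the positive class and all others as negative. The sketch below shows this one-vs-rest counting; the function name and the label lists are made-up illustrations, not results from this work.

```python
def one_vs_rest_counts(actual, predicted, positive):
    """Count (TP, TN, FP, FN), treating `positive` as the positive class."""
    tp = tn = fp = fn = 0
    for a, p in zip(actual, predicted):
        if a == positive and p == positive:
            tp += 1
        elif a != positive and p != positive:
            tn += 1
        elif a != positive and p == positive:
            fp += 1
        else:  # actual is positive but the prediction is another class
            fn += 1
    return tp, tn, fp, fn


# Illustrative labels only.
actual = ["s1", "s1", "s2", "s2", "s3"]
predicted = ["s1", "s2", "s2", "s2", "s1"]
counts_s1 = one_vs_rest_counts(actual, predicted, positive="s1")  # (1, 2, 1, 1)
```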


A confusion matrix is a useful evaluation criterion for machine learning models. It provides a view of model
performance that is both simple and effective. Some of the most common performance measures derived from the
confusion matrix follow.


Accuracy: The fraction of all samples that the classifier categorises correctly. It is calculated as
(TP+TN)/(TP+TN+FP+FN).


Misclassification Rate: The fraction of predictions that were incorrect, also known as Classification Error.
It is calculated as (FP+FN)/(TP+TN+FP+FN), or equivalently as 1 − Accuracy.
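Both measures follow directly from the four counts, as the short sketch below shows. The counts passed in are arbitrary illustrative numbers, not results reported in this paper.

```python
def accuracy(tp, tn, fp, fn):
    """Fraction of all predictions that were correct."""
    return (tp + tn) / (tp + tn + fp + fn)


def misclassification_rate(tp, tn, fp, fn):
    """Fraction of all predictions that were incorrect."""
    return (fp + fn) / (tp + tn + fp + fn)


# Illustrative counts only: 20 predictions in total.
acc = accuracy(tp=8, tn=9, fp=2, fn=1)                # 17 / 20
err = misclassification_rate(tp=8, tn=9, fp=2, fn=1)  # 3 / 20
```

The two values always sum to 1, which is why the misclassification rate can also be computed as 1 − Accuracy.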
5. Results & Conclusion 


This work describes an automated attendance system using facial recognition. The proposed model uses
Viola-Jones for face detection and for building the database used in the later stages. Features are extracted
using eigenfaces, the input face is recognised using Euclidean distance as the classifier, and the attendance
sheet is updated accordingly. The database holds 10 users with 150 images per user. For each user, 80% of the
images are used for training and 20% for testing, i.e. 120 images for training and 30 for testing. To calculate
TP, TN, FP, and FN, the system considered 5 users and used their data from the database.
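The 80/20 split per user can be sketched as below. The shuffling, the fixed seed, the function name, and the file names are assumptions made for the illustration; the text does not state how the 120 training images were selected.

```python
import random


def split_user_images(image_names, train_fraction=0.8, seed=0):
    """Shuffle one user's images and split them into training and testing lists."""
    names = list(image_names)
    random.Random(seed).shuffle(names)  # fixed seed keeps the split reproducible
    cut = round(len(names) * train_fraction)
    return names[:cut], names[cut:]


# Hypothetical file names for the 150 images in folder s1.
images = [f"s1/{i}.pgm" for i in range(1, 151)]
train, test = split_user_images(images)  # 120 training and 30 testing images
```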


Face recognition: This is one of the matched results, in which the input image is captured live and the output
image is recognised from the training dataset.
Fig.9. Correctly, Incorrectly recognized result and Confusion chart


Updating Attendance: This is the final, automatically updated attendance Excel sheet, from which we can get
the total number of days present for each student and analyse each student's attendance percentage.